Abstract

We study mathematical models of the collaborative solving of a two-choice
discrimination task. We estimate how the shared performance of a group of
n observers differs from the performance of a single person. Our paper is
a theoretical extension of the recent work of Bahrami et al. (2010) from a
dyad (a pair) to a group of n interacting minds. We analyze several models
of communication, decision-making and hierarchical information aggregation.

The maximal slope of psychometric function (closely related to the per- 
centage of right answers vs. easiness of the task) is a convenient parameter 
characterizing the decisive performance. For every model we investigated, 
the group performance turns out to be a product of two numbers: a scaling
factor depending on the group size and an average performance. The scaling
factor is a power function of the group size (with the exponent ranging from
0 to 1), whereas the average also varies: it is the arithmetic mean, the
quadratic mean, or the maximum of the individual slopes. We conclude that
voting can be almost as efficient as more elaborate communication models,
provided that the participants have similar individual performances.
1. Introduction



Everyone who has ever taken part in group decision making or problem
solving has probably wondered whether it actually made any sense: wouldn't
it be better if the most competent person simply made the choice? In other
words, the question is whether a group can outperform its most capable
member. Many studies have reported group decisions to be less accurate
Corfman and Kahn (1995); some, however, concluded that groups using even
simple majority voting can make better decisions than their members alone
Grofman (1978); Kerr and Tindale (2004); Hastie and Kameda (2005). We ask
a more general question: how does the group performance depend on the
individual performances of its participants and on the ways in which they
communicate?

This question is put in a new light by recent trends in cognitive
psychology, which, after a half-century-long fascination with isolated
cognition in an individual, acknowledge its constant interaction with the
social environment. It is increasingly realised that joint action and
cognition are not limited to committee or voter decisions but pervade our
everyday life, requiring constant coordination and integration of cognitive
and physical abilities. This trend, called distributed cognition Hutchins
and Lintern (1995), or extended cognition Clark (2006), brings the focus of
research to the mechanisms of cognitive and physical coordination Kirsh
(2006) that effectuate this integration, as well as to questions about how
the performance of a group compares to individual performance. For some
tasks that require different types of knowledge and abilities from group
participants, groups are likely to outperform individuals Hill (1982). For
others, such as simple discrimination tasks or estimates, the question
arises whether, and when, a group may be better than the best of its
members.

Group decision making obviously involves members interacting with each
other. Casting a vote requires a minimal amount of communication: one only
needs to state his or her choice. However, other group decision situations
allow extensive communication and negotiation of a decision. The question
is which forms of communication are most likely to bring an improved
outcome, and what is actually being communicated in those groups.

Recent experiments by Bahrami et al. (2010) have shown that cooperation
can be beneficial even in the case of an extremely simple task, and that
this benefit is best explained by the participants communicating their
relative confidences. In their study dyads (pairs) performed a perceptual
two-choice discrimination task: on every trial the participants were to
decide which of two consecutive stimuli (Gabor patches) had higher
contrast. First, decisions were collected from both persons; then the
participants were allowed to communicate in order to reach a joint
decision. The decision data obtained from a person was used to fit a
psychometric function, i.e. the probability of that person giving a certain
answer as a function of the contrast value. This function describes the
person's skill in the considered task. Similarly, a function describing the
skill of the group as a whole can be estimated from the group decisions.

Various assumptions about the nature of the within-group interactions 
during the decision-making process can be made. From these assumptions 
we can derive theoretical relationships between the parameters of members' 
functions and the parameters of the group function — these are the models 
of decision making. The correctness of each model can then be tested against 
empirical data. 



Bahrami et al. (2010) described and evaluated four such models. One was
their own, in which group members communicate their relative confidence in
their individual choices. Another stems from a signal detection theory
approach Sorkin et al. (2001): if the members know each other's
psychometric functions, the group can make a statistically optimal choice.
Thus, under certain conditions, we have an upper bound on group
performance. The third model stated that the dyad is as good as its best
member. Finally, the last model was random response selection. The study
concluded that, when similarly skilled persons meet, they can both benefit
from cooperation. The model in which participants communicate their
relative confidences best explains this benefit.

We extend the models from the Bahrami et al. (2010) study to groups of n
participants and compare their predictions. Furthermore, we add a model in
which a participant either knows the correct answer, or guesses. In larger
groups it may happen that only small subgroups of participants can
communicate simultaneously. We address this issue by considering
hierarchical schemes of decision aggregation, in which decisions are first
made by subgroups, then some of these subgroups interact with each other
and reach a shared decision, and so on.

The paper is organized as follows. In Section 2 we introduce the standard
model of the discrimination of two stimuli. We use it to assess
performances of individuals and groups. Section 3 contains a series of
models of communication, which express the performance of a group of n
persons as a function of their individual performances. In Section 4 we
investigate how each model works, assuming several schemes of decision
aggregation. Section 5 summarizes the introduced models and gives insight
into further experimental and theoretical work.



2. Model of discrimination 

Consider an experiment in which a participant has to make simple
discriminatory decisions of varying difficulty. Each trial is assigned a
parameter c that describes the physical distance between the stimuli to be
discriminated (e.g. in the Bahrami et al. experiment it was the difference
of the contrast between Gabor patches). A negative c describes a situation
in which the right choice is the first one, whereas a positive c means the
right choice is the second one. The absolute value of c reflects the
difficulty of a given trial: the lower it is, the more difficult the trial.

Knowing the choices of a certain decision-making agent (in our case either
a single participant or a group making the decision together) for a range
of contrasts, one can construct a mathematical description of its
performance on the task. For such an agent we can determine a psychometric
function, the probability of choosing the second answer as a function of
the displayed contrast, $P(c)$. An ideal responder would be described by
the Heaviside step function: $P(c) = 0$ for all negative contrasts and
$P(c) = 1$ for all positive contrasts. Since responders make errors, the
actual functions are different. In particular, the cumulative distribution
function of the normal distribution:

$$P(c) = H\left(\frac{c+b}{\sigma}\right), \quad \text{where} \tag{1}$$

$$H(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} \exp\left(-t^2/2\right) dt, \tag{2}$$

turns out to be a good fit for the experimental data Bahrami et al. (2010).
The parameter $\sigma$ is the width parameter: it can be seen as an
expression of the participant's uncertainty about the decision. The
parameter $b$ is the bias (offset): it represents the tendency to choose a
particular answer. The $P(c)$ function defined as above can be seen as a
convolution of the step function (the correct answer) and the Gaussian
distribution (the discriminative error), see Fig. 1.

[Plot: P(c), the probability of choosing the second option, against the
contrast c]

Figure 1: Plot of the psychometric function, showing the slope s and a
positive bias b. The shaded area W is proportional to the error rate of a
participant.

One way of describing such a response is signal detection theory Sorkin et
al. (2001). According to it, for a stimulus with contrast $c$ a participant
perceives a contrast $x$, which is a normally distributed random variable
centred around $c + b$ with standard deviation $\sigma$. Two models
described in this paper (i.e. Weighted Confidence Sharing and Direct Signal
Sharing) use this assumption explicitly.

For our purposes we assume that the bias is much smaller than the
characteristic width parameter, i.e. $|b| \ll \sigma$. This assumption
seems to be well justified for this and similar psychological experiments.
In the case of other psychological experiments, the bias may even be the
main (or the only) parameter, e.g. the model of averaging biases in a
situation when a group guesses a demographical quantity Rauhut and Lorenz
(2010). Consequently, $\sigma$ becomes the main determinant of the
effectiveness of discrimination. It is convenient to choose the maximal
slope of the psychometric function

$$s = \frac{1}{\sqrt{2\pi}\,\sigma} \tag{3}$$

as the primary measure of responder's effectiveness.
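As a quick illustration of (1) to (3), the sketch below evaluates the psychometric function with the standard library and compares the closed-form maximal slope $1/(\sqrt{2\pi}\sigma)$ with a finite-difference derivative taken at $c = -b$. The function names are illustrative choices and do not come from the paper.

```python
import math

def H(x):
    """Standard normal cumulative distribution function, eq. (2)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def psychometric(c, sigma, b=0.0):
    """P(c) = H((c + b) / sigma): probability of choosing the second option."""
    return H((c + b) / sigma)

def max_slope(sigma):
    """Closed-form maximal slope of P(c), attained at c = -b."""
    return 1.0 / (math.sqrt(2.0 * math.pi) * sigma)

def numerical_slope(sigma, b=0.0, h=1e-5):
    """Central-difference derivative of P at c = -b, for comparison."""
    return (psychometric(-b + h, sigma, b) - psychometric(-b - h, sigma, b)) / (2.0 * h)
```

A smaller width gives a steeper curve: halving `sigma` doubles `max_slope(sigma)`.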



Now we can proceed to extending the Bahrami et al. (2010) models. We would
like to know how the performance of a group of n people depends on their
individual cognitive performances. Therefore, we need to derive explicit
formulas for the propagation of slopes and biases when combining several
responders within each of the different models of communication:

$$s_{model} = s_{model}(s_1, b_1, \ldots, s_n, b_n), \tag{4}$$

$$b_{model} = b_{model}(s_1, b_1, \ldots, s_n, b_n). \tag{5}$$

Each model is described by the shared decision function

$$P_{model}(c) = f[P_1(c_1), \ldots, P_n(c_n)], \tag{6}$$

where $f$ is a functional. For all but two models
$P_{model}(c) = f[P_1(c), \ldots, P_n(c)]$, that is, the dependence is
pointwise (i.e. the result for a given $c$ requires only knowing the
individual $P_i(c)$ for the same $c$).

We can obtain the effective slope (4) and bias (5) using straightforward
formulas, which involve taking the derivative of the psychometric function
with respect to the contrast:

$$s_{model} = \max_c P'_{model}(c) \approx P'_{model}(c)\big|_{c=0}, \tag{7}$$

$$b_{model} = b \text{ for which } P_{model}(-b) = \tfrac{1}{2} \;\approx\; \frac{P_{model}(0) - \tfrac{1}{2}}{P'_{model}(c)\big|_{c=0}}, \tag{8}$$

where the approximations are calculated for relatively small biases, i.e.
the relative error for both $s$ and $b$ is of order $O(s^2 b^2)$ (or
equivalently, $O(b^2/\sigma^2)$). The derivation is in Appendix A. Note
that if $P_{model}(c)$ is the cumulative distribution function of a
Gaussian (as in (1)), the formulas for the slope (3) and (7) are
equivalent. The latter, however, is valid in the general case of an
arbitrary communication strategy $P_{model}(c)$.
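The small-bias approximations can be tried out numerically. The sketch below assumes they take the form $s \approx P'(0)$ and $b \approx (P(0) - 1/2)/P'(0)$, and applies them to a Gaussian psychometric function with known $\sigma$ and $b$, so that the estimates can be compared with the exact values. The helper name `effective_slope_and_bias` is hypothetical.

```python
import math

def H(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def effective_slope_and_bias(P, h=1e-5):
    """Small-bias estimates: slope ~ P'(0), bias ~ (P(0) - 1/2) / P'(0)."""
    slope = (P(h) - P(-h)) / (2.0 * h)   # central difference at c = 0
    bias = (P(0.0) - 0.5) / slope
    return slope, bias

# Example: a single Gaussian responder with a small bias (|b| << sigma).
sigma, b = 1.0, 0.05
P = lambda c: H((c + b) / sigma)
s_est, b_est = effective_slope_and_bias(P)
s_exact = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
```

For this example both estimates differ from the exact values by a relative error of roughly $b^2/\sigma^2$, in line with the stated order of the approximation.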

A question arises about the relation between the psychometric curve
parameters and the expected rate of errors. To assess the average amount
of wrong answers one can expect from a responder, we introduce the
following quantity

$$W(\sigma, b) = \int_{-\infty}^{0} P(c)\,dc + \int_{0}^{\infty} [1 - P(c)]\,dc \tag{9}$$

$$= \sigma\sqrt{\frac{2}{\pi}} \exp\left(-\frac{b^2}{2\sigma^2}\right) + b\left[2H\left(\frac{b}{\sigma}\right) - 1\right], \tag{10}$$

where we integrated the error function Abramowitz and Stegun (1965). For a
uniform distribution of stimuli over the range $(-r, r)$ with
$r \gg (\sigma + |b|)$, the rate of the wrong responses is given by
$W(\sigma, b)/(2r)$. The average number of wrong answers is always reduced
when lowering either the width or the bias, regardless of the other
parameter's value. This fact further justifies the choice of the slope as
a proper measure of the effectiveness. When there
is no bias, (10) simplifies to $W(\sigma, 0) = \sigma\sqrt{2/\pi} = 1/(\pi s)$,
thus the rate of the wrong responses is $1/(2\pi r s)$.
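The error measure $W$ can be checked by direct numerical integration. The sketch below assumes the closed form $\sigma\sqrt{2/\pi}\,\exp(-b^2/2\sigma^2) + b\,[2H(b/\sigma) - 1]$, which follows from integrating the Gaussian psychometric function, and compares it with a midpoint-rule quadrature of the defining integrals. The function names and the integration cut-off are assumptions made for this illustration.

```python
import math

def H(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def W_closed(sigma, b):
    """Closed form of the error measure W(sigma, b)."""
    gauss = sigma * math.sqrt(2.0 / math.pi) * math.exp(-b * b / (2.0 * sigma * sigma))
    return gauss + b * (2.0 * H(b / sigma) - 1.0)

def W_numeric(sigma, b, steps=20000):
    """Midpoint rule for int_{-inf}^0 P dc + int_0^inf (1 - P) dc."""
    c_max = 12.0 * sigma + abs(b)        # beyond this the integrand is negligible
    dc = c_max / steps
    total = 0.0
    for k in range(steps):
        c = (k + 0.5) * dc
        total += H((-c + b) / sigma)          # P(-c): wrong answers for negative contrast
        total += 1.0 - H((c + b) / sigma)     # 1 - P(c): wrong answers for positive contrast
    return total * dc
```

Evaluating `W_closed` on a grid also shows the monotonic behaviour described in the text: it grows with `sigma` at fixed `b` and with `abs(b)` at fixed `sigma`.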



3. Information-sharing models 

In this section we discuss different models of information-sharing of n
participants. It is important to underline that the models incorporate the
process of perceiving (what the subjects may know), the state of mind (what
the subjects know), and the communication and the decision-making process
(usually Bayes-optimal). We briefly define the assumptions of each model
and justify it in psychological terms. We give results in terms of the
effective psychometric function $P_{model}(c)$: the effective slope
$s_{model}$ and sometimes the effective bias $b_{model}$ (as for a few
models the bias is poorly defined). Whenever calculations of $P_{model}(c)$
are not straightforward, we give some insight into the underlying
mathematics.

We investigate the following models:

3.1 Random Responder,
3.2 Voting,
3.3 Best Decides,
3.4 Weighted Confidence Sharing,
3.5 Direct Signal Sharing,
3.6 Truth Wins.

3.1. Random Responder

Model. On each trial the decision of a randomly chosen group member is
taken as the group decision.

Motivation. It serves as one of the reference models and it is not expected
to hold in most realistic settings. Random factors determine the collective
decision, i.e. communication is seen as ineffective within the framework of
this model. Sometimes a decision is not based on any support and people may
have a very misleading impression of their own accuracy. Also, the decision
may depend more on one's charisma or persuasive skills than on the merits.
In the work of Bahrami et al. this model is called 'Coin flip'.

Results.

$$P_{RR}(c) = \frac{1}{n} \sum_{i=1}^{n} P_i(c). \tag{11}$$

After the differentiation one obtains the slope (7) and the bias (8):

$$s_{RR} \approx \frac{s_1 + \ldots + s_n}{n}, \tag{12}$$

$$b_{RR} \approx \frac{s_1 b_1 + \ldots + s_n b_n}{s_1 + \ldots + s_n}. \tag{13}$$

The relative error both for $s_{RR}$ and $b_{RR}$ is
$O(s_1^2 b_1^2) + \ldots + O(s_n^2 b_n^2)$. Note that $P_{RR}(c)$ is not
normal (1).
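A short numerical sketch of the Random Responder model: the group function is the plain average of the individual Gaussian psychometric functions, and the slope and bias are taken as the arithmetic mean of slopes and the slope-weighted mean of biases. The helper names are illustrative.

```python
import math

def H(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def slope_from_sigma(sigma):
    """Maximal slope of a Gaussian psychometric function with width sigma."""
    return 1.0 / (math.sqrt(2.0 * math.pi) * sigma)

def P_rr(c, sigmas, biases):
    """Random Responder: average of the members' psychometric functions."""
    return sum(H((c + b) / s) for s, b in zip(sigmas, biases)) / len(sigmas)

def rr_parameters(slopes, biases):
    """Arithmetic mean of slopes and slope-weighted mean of biases."""
    s_rr = sum(slopes) / len(slopes)
    b_rr = sum(s * b for s, b in zip(slopes, biases)) / sum(slopes)
    return s_rr, b_rr

sigmas = [1.0, 2.0, 0.5]
biases = [0.02, -0.01, 0.03]
slopes = [slope_from_sigma(s) for s in sigmas]
s_rr, b_rr = rr_parameters(slopes, biases)
```

With these small biases, the finite-difference slope of `P_rr` at zero agrees with `s_rr` to well under one percent, and `P_rr(-b_rr, sigmas, biases)` is very close to 1/2.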



3.2. Voting 

Model. Each participant makes her or his own decision. The group decision
is made by majority voting. If the numbers of opposite opinions are equal,
a coin is flipped.

Motivation. People may have no access to their accuracy (or they cannot 
communicate it reliably), thus a good strategy is to take voting as the final 
consensus result. 



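Results.

$$P_{Vot}(c) = \sum_{k=0}^{\lceil n/2 \rceil - 1} \sum_{i} \frac{1}{k!\,(n-k)!}\,[1-P_{i_1}(c)] \cdots [1-P_{i_k}(c)]\,P_{i_{k+1}}(c) \cdots P_{i_n}(c) \tag{14}$$

$$\qquad + \frac{1}{2} \sum_{i} \frac{1}{[(n/2)!]^2}\,[1-P_{i_1}(c)] \cdots [1-P_{i_{n/2}}(c)]\,P_{i_{n/2+1}}(c) \cdots P_{i_n}(c) \quad \text{if } n \text{ is even},$$

where the sum over $i$ denotes the sum over every permutation of
participants (the factorials compensate for orderings that describe the
same split of votes). We obtain (derivation in Appendix B)

$$s_{Vot} \approx a_n\,\frac{s_1 + \ldots + s_n}{n}, \qquad a_n = \begin{cases} \dfrac{n}{2^{n-1}} \dbinom{n-1}{n/2} & \text{if } n \text{ is even} \\[2ex] \dfrac{n}{2^{n-1}} \dbinom{n-1}{(n-1)/2} & \text{if } n \text{ is odd} \end{cases} \tag{15}$$

$$\approx \sqrt{\frac{2n}{\pi}}\;\frac{s_1 + \ldots + s_n}{n}, \tag{16}$$

$$b_{Vot} \approx \frac{s_1 b_1 + \ldots + s_n b_n}{s_1 + \ldots + s_n}. \tag{17}$$

$P_{Vot}(c)$ is not normal (1). The relative error both for $s_{Vot}$ and
$b_{Vot}$ is $O(s_1^2 b_1^2) + \ldots + O(s_n^2 b_n^2)$. Note that adding a
member to a group of odd size does not increase its average performance.
The formula (16) is an asymptotic expression for large n, which makes use
of the Wallis formula. For n = 2 the Random Responder and Voting models
give the same results.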
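The voting rule can be enumerated directly for small groups. The sketch below sums over all vote patterns, counts ties as a coin flip, and compares the resulting slope with an amplification factor written as $n\binom{n-1}{\lfloor (n-1)/2 \rfloor}/2^{n-1}$. That closed form is this sketch's own derivation (the probability that a single member's vote is decisive), not necessarily the form used in the paper's appendix, and the function names are hypothetical.

```python
import itertools
import math

def P_vote(p):
    """Probability that majority voting (coin flip on ties) selects the
    second option, given each member's probability p[i] of choosing it."""
    n = len(p)
    total = 0.0
    for votes in itertools.product((0, 1), repeat=n):
        prob = 1.0
        for vote, p_i in zip(votes, p):
            prob *= p_i if vote else 1.0 - p_i
        k = sum(votes)                 # number of votes for the second option
        if 2 * k > n:
            total += prob
        elif 2 * k == n:
            total += 0.5 * prob        # tie: coin flip
    return total

def amplification(n):
    """Factor a_n such that the voting slope is about a_n times the mean slope."""
    return n * math.comb(n - 1, (n - 1) // 2) / 2 ** (n - 1)
```

Near zero contrast each member behaves as $P_i(c) \approx 1/2 + s_i c$, so the group slope can be estimated by perturbing every `p[i]` by `s_i * h`. For two members, `P_vote` reduces to the average of the two probabilities, which is the Random Responder result.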

3.3. Best Decides 

Model. The most accurate member of the group makes the decision. This model
is called Behavior and Feedback in Bahrami et al. (2010). In this model we
will focus on the case with no bias, $b = 0$. A nonzero bias would make the
result hard to state in explicit form, see (10).

Motivation. In some experimental settings members of the group can
determine who is the best of them (e.g. when feedback is present). Then
they can let him or her make the final decision. Studies by Henry (1995)
suggest that, at least in some types of tasks, participants can identify
the most proficient member, so our assumption is plausible. As in the
previous models, there is no (effective) communication between the members
of the group.

Results.

$$P_{BD}(c) = P_{\text{member with the highest } s}(c), \tag{18}$$

$$s_{BD} = \max(s_1, \ldots, s_n). \tag{19}$$

In the case when biases are large, the group psychometric function is that
of the most effective participant (i.e. the one with the lowest
$W(\sigma_i, b_i)$, see (10)), $P_{BD}(c) = P_i(c)$. The strategy is the
most beneficial for a group with very diverse individual performances.

3.4. Weighted Confidence Sharing

Model. Group members share their relative confidences
$z_i = x_i/\sigma_i$. The group decision depends on the sign of
$\sum_{i=1}^{n} z_i$, i.e. when it is negative they choose the first
option, and when it is positive, the second. This model requires each
$P_i(c)$ to be normal (1).

Motivation. The value $x_i$ is the contrast perceived by the i-th
participant and has the distribution with the density $P'_i(c)$, as in
Sorkin et al. (2001). The true contrast $c$ is, of course, common for all
participants in a given trial. The relative confidence is equivalent to a
z-score if the participant is unbiased (i.e. it is related to the
probability that the participant is right). Put differently, participants
know their z-scores on a given trial, but are unaware of their own
parameters $s$ and $b$. This model was first introduced by Bahrami et al.
(2010). It is possible that in an experimental trial each participant can
estimate and effectively communicate their relative confidence, by saying
phrases that are a coarse real-world approximation of one's z-score, e.g.
'I lean towards the 1st', or 'I am almost sure it is the 2nd'. The study of
Bahrami et al. suggests that this model most accurately describes dyad
performance.

Given relative confidences $z = (z_1, \ldots, z_n)$, the group has to
determine whether to choose the first or the second option. If there are
only two participants, in the case of different opinions, the one with the
stronger confidence (in this trial) decides. This can be written as
follows: the group chooses the first option if $z_1 + z_2 < 0$, and the
second option otherwise, yielding the optimal strategy Bahrami et al.
(2010). In the general case of n participants we use Bayes-optimal
reasoning. We calculate the probability that the contrast is positive (and
thus the second answer is correct) given the z-scores provided by each
participant:

$$p(c > 0 \,|\, z) = \int_0^{\infty} p(c \,|\, z)\,dc = \frac{\int_0^{\infty} p(z \,|\, c)\,p(c)\,dc}{\int_{-\infty}^{\infty} p(z \,|\, c)\,p(c)\,dc}, \tag{20}$$

where $p(c)$ is the probability of a discrimination task with contrast
$c$. The probability density of observing the score $z_i$ given contrast
$c$ is $P'_i(c - \sigma_i z_i)$, thus

$$p(z \,|\, c) = P'_1(c - x_1) \cdot \ldots \cdot P'_n(c - x_n). \tag{21}$$

Let's assume that the displayed contrast has a uniform distribution, i.e.
$p(c)$ is constant (not going into mathematical nuances). In order to
define the decision function we need to know when $p(c > 0 \,|\, z) > 1/2$
or, in other words, when the probability that the second answer is correct
is greater than 1/2. As the integrand in (20) is a Gaussian function of
$c$, finding its maximum leads to the condition

$$\frac{z_1}{\sigma_1} + \ldots + \frac{z_n}{\sigma_n} > 0, \tag{22}$$

or equivalently, using the slope parameter,

$$s_1 z_1 + \ldots + s_n z_n > 0. \tag{23}$$

Thus when the condition holds, choosing the second option is the
Bayes-optimal choice. Unfortunately, in the considered model we only have
access to the values of $z$, not to individual performances. In order to
get the precise answer for the optimal choice we would need to know the
whole distribution of $\sigma_i$ (or $s_i$). Instead, we can use the
approximate condition for the choice of the second option

$$z_1 + \ldots + z_n > 0, \tag{24}$$

to obtain a lower bound on the performance. The condition is exact for
participants with equal performances (and should be close to optimal if
the values of $\sigma_i$ do not vary much). This equation can be seen as a
kind of weighted voting, where weights depend on subjective confidences,
but not on individual performances. Members don't know their own or their
peers' parameters, so there is no justification for assigning more or less
weight to a particular member for the whole experiment. The only thing
that matters is confidence in the present trial.
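The difference between the Bayes-optimal rule and the plain sum of z-scores can be made concrete. The sketch below assumes unbiased Gaussian observers and a flat prior on the contrast, in which case the posterior of $c$ is Gaussian with precision $\sum 1/\sigma_i^2$ and a precision-weighted mean of the perceived contrasts. All function names are illustrative.

```python
import math

def H(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def posterior_positive(xs, sigmas):
    """p(c > 0 | x) for a flat prior on c and unbiased Gaussian observers."""
    precision = sum(1.0 / s ** 2 for s in sigmas)
    mean = sum(x / s ** 2 for x, s in zip(xs, sigmas)) / precision
    return H(mean * math.sqrt(precision))

def optimal_choice(zs, sigmas):
    """Bayes-optimal rule: choose the second option if sum(z_i / sigma_i) > 0."""
    return sum(z / s for z, s in zip(zs, sigmas)) > 0

def wcs_choice(zs):
    """Weighted Confidence Sharing rule: choose the second option if sum(z_i) > 0."""
    return sum(zs) > 0

# A precise member (sigma = 0.5) mildly favours the first option, while an
# imprecise member (sigma = 2.0) favours the second with a larger z-score.
sigmas = [2.0, 0.5]
xs = [1.0, -0.2]
zs = [x / s for x, s in zip(xs, sigmas)]   # [0.5, -0.4]
```

In this example the plain sum of z-scores selects the second option, whereas the Bayes-optimal rule, which knows the widths, sides with the more precise member. When all widths are equal the two rules coincide.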

Results. To calculate $P_{WCS}(c)$ we need to find, for a given contrast
$c$, the probability of obtaining a set $z$ with a positive sum (24), thus

$$P_{WCS}(c) = \int_{x_1/\sigma_1 + \ldots + x_n/\sigma_n > 0} \exp\left[-\frac{(c + b_1 - x_1)^2}{2\sigma_1^2} - \ldots - \frac{(c + b_n - x_n)^2}{2\sigma_n^2}\right] \frac{dx_1 \cdots dx_n}{(2\pi)^{n/2}\,\sigma_1 \cdots \sigma_n} \tag{25}$$

$$= H\left(\sqrt{2\pi}\; s_{WCS}\,(c + b_{WCS})\right), \tag{26}$$

where the integration is based on the fact that a sum of Gaussian random
variables $z_i$ is a Gaussian random variable Piau (2011). The resulting
parameters are:

$$s_{WCS} = n^{1/2} \times \frac{s_1 + \ldots + s_n}{n}, \tag{27}$$

$$b_{WCS} = \frac{s_1 b_1 + \ldots + s_n b_n}{s_1 + \ldots + s_n}. \tag{28}$$

Again, note that the above result for $s_{WCS}$ is the lower bound for the
optimal Bayesian reasoning, exact only for n = 2 (due to symmetry) and for
a group of participants with the same performances. Knowing the exact
distribution of individual performances one can get a better (or at least
the same) group performance. Then, instead of the summation of individual
z-scores (24), one will get a more complicated formula for the decision.
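The Weighted Confidence Sharing result can be checked with a small Monte Carlo experiment. The sketch below draws each perceived contrast from a normal distribution centred at $c + b_i$ with standard deviation $\sigma_i$ (the signal-detection assumption), applies the sign-of-the-sum rule, and compares the hit frequency with the prediction $H(\sqrt{2\pi}\,s_{WCS}(c + b_{WCS}))$. The number of trials and the seed are arbitrary choices for illustration.

```python
import math
import random

def H(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def wcs_parameters(slopes, biases):
    """Group slope sqrt(n) * mean(s) and slope-weighted mean bias."""
    n = len(slopes)
    s_wcs = sum(slopes) / math.sqrt(n)
    b_wcs = sum(s * b for s, b in zip(slopes, biases)) / sum(slopes)
    return s_wcs, b_wcs

def simulate_wcs(c, sigmas, biases, trials, rng):
    """Fraction of trials in which the sum of z-scores is positive."""
    hits = 0
    for _ in range(trials):
        z_sum = sum(rng.gauss(c + b, sigma) / sigma for sigma, b in zip(sigmas, biases))
        hits += z_sum > 0
    return hits / trials

sigmas = [1.0, 2.0, 0.7]
biases = [0.05, -0.02, 0.0]
slopes = [1.0 / (math.sqrt(2.0 * math.pi) * s) for s in sigmas]
s_wcs, b_wcs = wcs_parameters(slopes, biases)
c = 0.3
predicted = H(math.sqrt(2.0 * math.pi) * s_wcs * (c + b_wcs))
simulated = simulate_wcs(c, sigmas, biases, 100000, random.Random(1))
```

The simulated frequency should match the prediction up to sampling noise, which is of order $1/\sqrt{\text{trials}}$.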



3.5. Direct Signal Sharing 

Model. Group members share both their perceived contrasts $x_i$ and their
$\sigma_i$. The group decision depends on the sign of
$\sum_{i=1}^{n} x_i/\sigma_i^2$. This model requires each $P_i(c)$ to be
normal (1).

Motivation. As for the WCS, we assume that the value $x_i$ is the contrast
perceived by the i-th participant and has the distribution with the density
$P'_i(c)$, as in Sorkin et al. (2001). The group possesses complete
knowledge about the characteristics of its members and their perception, so
its effectiveness is hindered only by the skill of the participants, not by
communication. This model constitutes the upper bound for group
performance, provided that the stimuli are fully defined by their contrast
values (and perceived according to the discussed model). In the case of a
more complex, non-perceptual task it is possible for a group to exceed this
bound Hill (1982), for example when participants' skills complement each
other. People know the strength of the stimuli, but also their own
sensitivity. If feedback is provided, one can plot $x$ versus $c$ to get
$\sigma$.

Results. The final group decision follows the standard derivation for n
classifiers collecting independent results with normal distribution (e.g.
Sorkin et al. (2001) and Bahrami et al. (2010)):

$$P_{DSS}(c) = \frac{1}{\text{normalization}} \int_{-\infty}^{c} P'_1(x) \cdot \ldots \cdot P'_n(x)\,dx, \tag{29}$$

$$s_{DSS} = n^{1/2} \times \sqrt{\frac{s_1^2 + \ldots + s_n^2}{n}}, \tag{30}$$

$$b_{DSS} = \frac{s_1^2 b_1 + \ldots + s_n^2 b_n}{s_1^2 + \ldots + s_n^2}. \tag{31}$$


Note that regardless of the distribution of the individual performances, the 
group performance outscores both Best Decides and Weighted Confidence 
Sharing. 
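The ordering of the three slopes is easy to see numerically. The sketch below uses the slope formulas of the Best Decides, Weighted Confidence Sharing and Direct Signal Sharing models (maximum, $\sqrt{n}$ times the arithmetic mean, and $\sqrt{n}$ times the quadratic mean) on two illustrative groups, one with similar and one with very different members. The example slope values are made up.

```python
import math

def s_bd(slopes):
    """Best Decides: the highest individual slope."""
    return max(slopes)

def s_wcs(slopes):
    """Weighted Confidence Sharing: sqrt(n) times the arithmetic mean."""
    return sum(slopes) / math.sqrt(len(slopes))

def s_dss(slopes):
    """Direct Signal Sharing: sqrt(n) times the quadratic mean."""
    return math.sqrt(sum(s * s for s in slopes))

similar = [1.0, 1.1, 0.9]      # hypothetical group of comparable members
dissimilar = [1.0, 0.2, 0.1]   # hypothetical group with one clear expert
```

For the similar group WCS is almost as good as DSS and clearly better than the best member alone. For the dissimilar group WCS falls below the best member, while DSS still exceeds both.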

3.6. Truth Wins 

Model. We assume that on each trial each member is in one of the two states: 
either knowing the right answer or being aware of his/her ignorance. In the 
latter case a random guess is made. So it is sufficient to have a single group 
member to perceive the stimuli correctly in order to get the correct group 
answer. We assume no bias as there is no possible way to treat it consistently, 
as it introduces false convictions. 

Motivation. For so-called eureka-type problems the signal-theoretic limit
can be exceeded Hill (1982). The key is that the answer to such a problem
has the property of demonstrability: it allows a single member, who figured
out the answer, to easily convince the rest of the group about its
correctness Laughlin et al. (1975). People may know if they see the
contrast stimuli (and all errors are only due to guessing, not to false
observations). The model has received much attention in group decision
theory, e.g. in Davis (1973). It is appropriate for situations in which the
correctness of a solution can be demonstrated. However, we don't expect
this model to be realized in tasks similar to that of Bahrami et al.
(2010). It serves as a reference and an explicit example of a result beyond
the one provided by the Direct Signal Sharing model; we included it with
the aim of generalizing the models to different decision situations.

Results. The chance that the responder knows the right answer with
certainty is

$$R(c) = |2P(c) - 1|. \tag{32}$$

That is, it is the inverse of the formula stating that when one knows the
answer with probability $R(c)$, one effectively answers correctly with
probability $R(c) + (1 - R(c))/2$ (as there is a chance to answer correctly
by a random guess). The probability that at least one person knows the
correct answer is

$$R_{TW}(c) = 1 - [1 - R_1(c)] \cdot \ldots \cdot [1 - R_n(c)]. \tag{33}$$

Consequently,

$$P_{TW}(c) = \frac{\mathrm{sign}(c)\,R_{TW}(c) + 1}{2}, \tag{34}$$

$$s_{TW} = n \times \frac{s_1 + \ldots + s_n}{n}, \tag{35}$$

where the slope is a result of the straightforward differentiation (7).

The model yields a much better result than the other models; note, however,
that the absence of false observations is a strong requirement. Other
models have to operate without this assumption. Note that $P_{TW}(c)$ is
not normal.
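The Truth Wins construction can be followed step by step in code. The sketch below assumes unbiased Gaussian members, computes each member's probability of knowing the answer as $|2P_i(c) - 1|$, combines them into the probability that at least one member knows, and converts back to a psychometric function. The numerical slope at zero should then be close to the sum of the individual slopes. Function names are illustrative.

```python
import math

def H(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def P_tw(c, sigmas):
    """Truth Wins psychometric function for unbiased Gaussian members."""
    nobody_knows = 1.0
    for sigma in sigmas:
        knows = abs(2.0 * H(c / sigma) - 1.0)   # R_i(c)
        nobody_knows *= 1.0 - knows
    r_tw = 1.0 - nobody_knows                   # at least one member knows
    sign = (c > 0) - (c < 0)
    return (sign * r_tw + 1.0) / 2.0

sigmas = [1.0, 2.0, 0.5]
slopes = [1.0 / (math.sqrt(2.0 * math.pi) * s) for s in sigmas]
h = 1e-6
numerical_slope = (P_tw(h, sigmas) - P_tw(-h, sigmas)) / (2.0 * h)
```

At a moderate contrast such as `c = 1.0` the group function is also visibly above that of the best single member, `H(1.0 / 0.5)`.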



4. Aggregation of information in hierarchical schemes 

So far we have assumed that information from all participants is
simultaneously collected and used in the group decision. One may argue that
this is an unrealistic model of human communication for groups of more than
a few people. We therefore propose hierarchical models (schemes) in which
only small subgroups can communicate at a time. Each of these subgroups
reaches its own decision, in a way described by one of the models
introduced in the previous section. Hence, it can be regarded as a
decision-making agent, described by a slope and a bias. The subgroup can
then communicate with other subgroups or individual members, which results
in larger groups being created, until all information is gathered and the
final decision is made.

The results of employing a multi-level decision system can significantly
deviate from what simultaneous information collection predicts. For
instance, in a two-level voting system, which has been widely studied in
the context of election results (e.g. Davis (1973); Laughlin et al.
(1975)), the final outcome depends heavily on the distribution of votes in
subgroups, sometimes allowing minority groups to overcome the majority,
sometimes exaggerating the power of the majority. It is thus interesting to
study the possible effects of such hierarchical systems.

Let's propose the following model for communication of n participants:

1. At the beginning there are n agents.

2. Each turn only g (for our purpose: 2 or 3) agents (groups or
individuals) share their information according to a chosen model. Then they
are merged into one agent (defined by $s_{model}(s_1, \ldots, s_g)$).

In other words, a group of people who shared information is treated as a
single agent in the next turn. There are two free parameters:

• Model used to combine members' parameters into group parameters.

• Structure in which groups are formed, i.e. the way to determine which
agents should interact in a given turn.

Let's consider the following ways of forming the groups (see Fig. 2 for the
diagram of the first two schemes):

4.1 Shallow hierarchy: Each turn g agents from the groups with the least
number of participants interact.

4.2 Deep hierarchy: Each turn g - 1 agents join the group with the largest
number of participants.

4.3 Random hierarchy: Each turn g random agents interact.

[Diagram with two panels: Shallow Scheme and Deep Scheme]

Figure 2: Diagram of the interaction ordering for aggregation schemes for
g = 2. Shallow Scheme: each turn two agents from the least numerous groups
interact. Deep Scheme: each turn a single participant joins the previously
formed group.



For some models the way in which groups are formed is irrelevant for
obvious reasons. This is the case for Random Responder, Best Decides,
Direct Signal Sharing and Truth Wins. The result is always the same and
equivalent to the simplest situation without any hierarchy. The models that
are affected to a certain degree are Weighted Confidence Sharing and
Voting.

Note that agents in principle do not know their slopes, so the order of
interactions cannot depend on their individual (or group) $s_i$. However,
as both $s_{WCS}$ and $s_{Vot}$ depend linearly on $s_i$, averaging over
every permutation of participants yields a result that is proportional to
the arithmetic mean of $s_i$, or $\langle s \rangle$. Consequently, to
investigate the influence of the hierarchical information aggregation on
the result, it suffices to treat each participant as if its performance
were equal to $\langle s \rangle$.

For our convenience we consider a more general model with the parameter
$a_g$ (the amplification multiplier) depending on $g$ (the group size):

$$s_{a_g}(s_1, \ldots, s_g) = a_g\,\frac{s_1 + \ldots + s_g}{g}. \tag{36}$$

It covers both the WCS ($a_2 = \sqrt{2}$, $a_3 = \sqrt{3}$, ...) and Voting
($a_3 = 3/2$, ...) models and allows us to give results in an elegant
general form.

4.1. Shallow hierarchy

The justification of the Shallow hierarchy is the following: people may
locally find their partners and then make a collective decision. Then,
iteratively, groups of the same (or similar) size make the collective
decision.

The analysis is simple when the number of participants is a power of $g$,
i.e. $n = g^k$ where $k$ is a natural number. Then, after every few
elementary steps the number of agents is reduced by the factor of $g$, and
the agents' slopes are multiplied by the factor $a_g$. In the end we get

$$s_{a_g,\mathrm{Shallow},g} = (a_g)^k \langle s \rangle = n^{\log_g a_g} \langle s \rangle. \tag{37}$$

In particular, for the Weighted Confidence Sharing (i.e. $a_g = \sqrt{g}$)
we reach the saturation

$$s_{WCS,\mathrm{Shallow},g} = \sqrt{n}\,\langle s \rangle \tag{38}$$

and thus the aggregation process does not introduce even the slightest
decrease in the group performance, compared with collecting all information
at once. The formula (38) holds only for $n$ that is a power of $g$.
However, for other values of $n$ the formula works as a very good
approximation, see Fig. 3 for the numerical results. The relation (i.e.
that for groups of size $n = g^k$ we reach the efficiency of the model
without aggregation, or $s_{a_g,\mathrm{Shallow},g} = s_{a_n}$) is true for
every model described by (36) with $a_g = g^{\alpha}$ for any $\alpha$.



In the Voting model we need to consider the aggregation in groups of at
least three (i.e. $g = 3$ and $a_g = 3/2$), otherwise it is the Random
Responder model. For $n$ being a power of three we get

$$s_{Vot,\mathrm{Shallow},g=3} = n^{\log_3 (3/2)} \langle s \rangle \approx n^{0.37} \langle s \rangle, \tag{39}$$

which works as a good approximation also for general odd $n$. For every
even $n$ there is at least one process with two parties, significantly
decreasing the total performance (as voting for two participants reduces to
a coin flip).
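The Shallow hierarchy is straightforward to simulate when $n$ is a power of $g$. The sketch below merges agents level by level in blocks of $g$ using the amplified arithmetic mean and compares the final slope with $n^{\log_g a_g}\langle s \rangle$. The function names are illustrative, and the code assumes that the number of agents is an exact power of $g$.

```python
import math

def merge(slopes, a_g):
    """Amplified arithmetic mean of the merged agents' slopes."""
    return a_g * sum(slopes) / len(slopes)

def shallow(slopes, g, a_g):
    """Merge agents level by level in blocks of g.
    Assumes len(slopes) is a power of g."""
    level = list(slopes)
    while len(level) > 1:
        level = [merge(level[i:i + g], a_g) for i in range(0, len(level), g)]
    return level[0]

def shallow_prediction(slopes, g, a_g):
    """n ** log_g(a_g) times the mean slope."""
    n = len(slopes)
    return n ** math.log(a_g, g) * sum(slopes) / n
```

For example, 16 agents merged in pairs with $a_2 = \sqrt{2}$ end up with a slope four times the mean, and 27 agents merged in triples with $a_3 = 3/2$ end up with $1.5^3 = 3.375$ times the mean. Because the merge is linear, this holds for unequal slopes as well.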
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4-2. Adding one or two at a time 

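In this case, there is a single group which single agents join one after another. For the Weighted Confidence Sharing model the resulting slope is

$$ s_{\mathrm{WCS,Deep},g=2} = 2^{-(n-1)/2} \langle s \rangle + \sum_{i=1}^{n-1} 2^{-i/2} \langle s \rangle = \left( 1 + \sqrt{2} - 2^{1-n/2} \right) \langle s \rangle \qquad (40) $$

and for the Voting model, for an odd n and aggregation in groups of three,

$$ s_{\mathrm{Vot,Deep},g=3} = 2^{-(n-1)/2} \langle s \rangle + 2 \sum_{i=1}^{(n-1)/2} 2^{-i} \langle s \rangle = \left( 2 - 2^{-(n-1)/2} \right) \langle s \rangle. \qquad (41) $$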

We see that the Deep hierarchy is very inefficient: the multiplier of $\langle s \rangle$ converges to a constant instead of growing with n. This leads to the conclusion that simultaneous aggregation (the Shallow hierarchy) is not only more natural, but also much more efficient.
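
The closed forms (40) and (41) can be compared with a direct iteration of the joining process. This is a minimal sketch for equally skilled agents, assuming that a merged group of g agents has the slope $a_g$ times the arithmetic mean of the merged slopes, with $a_2 = \sqrt{2}$ for the Weighted Confidence Sharing and $a_3 = 3/2$ for the Voting. The helper names are hypothetical.

```python
import math

def deep_slope(slopes, g, a_g):
    """Deep hierarchy: g - 1 new agents join the existing group each step.

    Assumes a merged group of g agents has the slope
    a_g * (arithmetic mean of the g slopes), with the current group
    counted as a single agent.
    """
    if (len(slopes) - 1) % (g - 1) != 0:
        raise ValueError("n - 1 must be divisible by g - 1")
    group = slopes[0]
    for i in range(1, len(slopes), g - 1):
        joiners = slopes[i:i + g - 1]
        group = a_g * (group + sum(joiners)) / g
    return group

def wcs_deep_closed(n):
    return 1 + math.sqrt(2) - 2 ** (1 - n / 2)   # multiplier in (40)

def vot_deep_closed(n):
    return 2 - 2 ** (-(n - 1) / 2)               # multiplier in (41)

for n in (3, 9, 33):
    ones = [1.0] * n  # equally skilled agents, <s> = 1
    print(n,
          deep_slope(ones, 2, math.sqrt(2)), wcs_deep_closed(n),
          deep_slope(ones, 3, 1.5), vot_deep_closed(n))
```

For each n the iterated value and the closed form should coincide up to rounding, and both columns flatten out quickly as n grows.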

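To obtain the asymptotic value of $s_{a_g,\mathrm{Deep},g}$ one can consider an equilibrium situation in which $g - 1$ individuals join a group that has already reached the limit,

$$ s_{a_g,\mathrm{Deep},g} = \frac{a_g}{g} \left[ (g-1) \langle s \rangle + s_{a_g,\mathrm{Deep},g} \right], \qquad (42) $$

leading to

$$ s_{a_g,\mathrm{Deep},g} = \frac{g-1}{g/a_g - 1}\, \langle s \rangle. \qquad (43) $$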
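
As a quick numerical illustration of (43), the sketch below evaluates the limit and compares it with a long iteration of the equilibrium map in (42) for agents of unit slope. It assumes $a_g < g$ (so that the iteration converges) and uses $a_2 = \sqrt{2}$ and $a_3 = 3/2$ as in the text. The function names are illustrative.

```python
import math

def deep_limit(g, a_g):
    """Asymptotic multiplier of <s> from (43)."""
    return (g - 1) / (g / a_g - 1)

def iterate_deep(g, a_g, steps=200):
    """Repeat S -> a_g * (S + (g - 1) * 1) / g for agents with slope 1."""
    s = 1.0
    for _ in range(steps):
        s = a_g * (s + (g - 1)) / g
    return s

print(round(deep_limit(2, math.sqrt(2)), 2))  # WCS, g = 2: 2.41
print(round(deep_limit(3, 1.5), 2))           # Voting, g = 3: 2.0
print(iterate_deep(2, math.sqrt(2)), iterate_deep(3, 1.5))
```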



4.3. Random hierarchy

But what happens between the Shallow hierarchy and the Deep hierar- 
chy? If the groups merge at random, is the final s closer to the most efficient 
aggregation scheme, or non-scaling as in adding a few at a time? The answer, 
not surprisingly, lies between. 

We parameterize time with t starting from 0. In each turn g randomly chosen agents merge into one agent with the slope given by (36). The current number of agents is $n_t = n_0 - (g-1)t$. We investigate how the density function of slopes $p_t(s)$ (normalized so that it integrates to $n_t$) evolves with time; it obeys

$$ p_{t+1}(s) - p_t(s) = -g\, \frac{p_t(s)}{n_t} + \int \frac{p_t(s_1)}{n_t} \cdots \frac{p_t(s_g)}{n_t}\, \delta\!\left( s_{\mathrm{model}}(s_1, \ldots, s_g) - s \right) ds_1 \cdots ds_g, \qquad (44) $$

where $\delta$ is the Dirac delta. The difference in distributions $p_{t+1}(s) - p_t(s)$ involves two processes. The first term means that we take g random agents, which interact and are removed from the distribution. The second term means that for every possible group of g agents (with slopes $s_1, \ldots, s_g$) one agent is created with the slope $s_{\mathrm{model}}(s_1, \ldots, s_g)$.

Note that we use integrals, but a sum over a finite set gives the same result. The parameter we care about the most is the mean slope, that is

$$ \langle s \rangle_t = n_t^{-1} \int s\, p_t(s)\, ds. \qquad (45) $$



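Let us multiply (44) by s and integrate over s. For our case (36) this gives the relatively simple result $n_{t+1} \langle s \rangle_{t+1} = n_t \langle s \rangle_t - g \langle s \rangle_t + a_g \langle s \rangle_t$, or

$$ \langle s \rangle_t = \frac{n_0 - (g-1)t + a_g - 1}{n_0 - (g-1)t}\, \langle s \rangle_{t-1}. \qquad (46) $$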
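
The recurrence (46) can be compared with a direct Monte Carlo simulation of random merging. The sketch below is only an illustration: it assumes that rule (36) is $a_g$ times the arithmetic mean of the merged slopes, uses equally skilled agents, and picks the g agents uniformly at random in each turn. The names `random_merge` and `recurrence_multiplier`, the seed and the number of trials are arbitrary choices.

```python
import random

def random_merge(slopes, g, a_g, rng):
    """Merge g randomly chosen agents per turn until one agent is left."""
    pool = list(slopes)
    if (len(pool) - 1) % (g - 1) != 0:
        raise ValueError("n - 1 must be divisible by g - 1")
    while len(pool) > 1:
        rng.shuffle(pool)
        group = [pool.pop() for _ in range(g)]
        pool.append(a_g * sum(group) / g)   # assumed form of rule (36)
    return pool[0]

def recurrence_multiplier(n, g, a_g):
    """Iterate (46) from t = 1 until a single agent remains."""
    mult = 1.0
    for t in range(1, (n - 1) // (g - 1) + 1):
        n_t = n - (g - 1) * t
        mult *= (n_t + a_g - 1) / n_t
    return mult

rng = random.Random(0)
n, g, a_g = 16, 2, 2 ** 0.5          # Weighted Confidence Sharing in pairs
trials = 5000
simulated = sum(random_merge([1.0] * n, g, a_g, rng)
                for _ in range(trials)) / trials
print(simulated, recurrence_multiplier(n, g, a_g))
```

The simulated average fluctuates from run to run, so only approximate agreement with the recurrence should be expected.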

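To obtain the final result we need to calculate $\langle s \rangle_{t_{\max}}$ for the time at which only one agent remains. We take $t_{\max} = (n_0 - 1)/(g - 1)$ to be an integer (e.g. for g = 3 this means that we need to consider an odd number of participants; for g = 2 there are no restrictions). Then, remembering that $\langle s \rangle_0 = \langle s \rangle$ and $n_0 = n$, we get

$$ \langle s \rangle_{t_{\max}} = \prod_{t=1}^{t_{\max}} \left( \frac{n - (g-1)t + a_g - 1}{n - (g-1)t} \right) \langle s \rangle \qquad (47) $$

$$ = \frac{\Gamma\!\left( \frac{1}{g-1} \right) \Gamma\!\left( \frac{n + a_g - 1}{g-1} \right)}{\Gamma\!\left( \frac{a_g}{g-1} \right) \Gamma\!\left( \frac{n}{g-1} \right)}\, \langle s \rangle \qquad (48) $$

$$ \approx \frac{\Gamma\!\left( \frac{1}{g-1} \right)}{\Gamma\!\left( \frac{a_g}{g-1} \right) (g-1)^{(a_g-1)/(g-1)}} \times n^{(a_g-1)/(g-1)} \times \langle s \rangle, \qquad (49) $$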

where $\Gamma(x)$ is the Euler gamma function, and we applied the Stirling approximation. For g = 2 we obtain the neat result

$$ s_{a_g,\mathrm{Random},g=2} \approx \frac{n^{a_2 - 1}}{\Gamma(a_2)}\, \langle s \rangle; \qquad (50) $$

in particular, for the Weighted Confidence Sharing model ($a_2 = \sqrt{2}$) we get

$$ s_{\mathrm{WCS,Random},g=2} \approx 1.13\, n^{0.41} \langle s \rangle, \qquad (51) $$

whereas for the Voting model with g = 3 (and an odd number of participants) we get

$$ s_{\mathrm{Vot,Random},g=3} \approx 1.22\, n^{0.25} \langle s \rangle. \qquad (52) $$

The numerical results, along with their approximations, are plotted in Fig. 3.
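
The prefactors and exponents quoted in (51) and (52) can be recomputed from the gamma-function expressions. The sketch below uses only the Python standard library; the function names are illustrative, and the values $a_2 = \sqrt{2}$ (Weighted Confidence Sharing) and $a_3 = 3/2$ (Voting) are taken from the text.

```python
import math

def exact_multiplier(n, g, a_g):
    """Gamma-function form (48), evaluated via log-gamma for stability."""
    q = g - 1
    log_m = (math.lgamma(1 / q) + math.lgamma((n + a_g - 1) / q)
             - math.lgamma(a_g / q) - math.lgamma(n / q))
    return math.exp(log_m)

def asymptotic_parameters(g, a_g):
    """Prefactor and exponent of n in the Stirling approximation (49)."""
    q = g - 1
    exponent = (a_g - 1) / q
    prefactor = math.gamma(1 / q) / (math.gamma(a_g / q) * q ** exponent)
    return prefactor, exponent

for label, g, a_g in (("WCS", 2, math.sqrt(2)), ("Vot", 3, 1.5)):
    prefactor, exponent = asymptotic_parameters(g, a_g)
    n = 1001  # odd, so that (n - 1) / (g - 1) is an integer for g = 3
    print(label, round(prefactor, 2), round(exponent, 2),
          exact_multiplier(n, g, a_g), prefactor * n ** exponent)
```

For large n the last two printed numbers in each row should be close, which is the content of the Stirling step.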



[Figure 3 panels: Weighted Confidence Sharing, g = 2; Voting, g = 3. Vertical axes: multiplier.]

Figure 3: Plot of numerically obtained multipliers of $\langle s \rangle$ for models with aggregation of information. Weighted Confidence Sharing with g = 2 for aggregation hierarchies: Shallow (circles), Deep (diamonds) and Random (squares). Voting with g = 3, and only for an odd number of participants, for aggregation hierarchies: Shallow (circles), Deep (diamonds) and Random (squares). The lines are the respective analytical results from Sec. 4.



5. Conclusion and further remarks 

In the paper we examined mathematical models of solving a two-choice discriminative task by a group of participants. We were interested in how the group performance depends on the performance of individuals, the ways of communication and the modes of decision aggregation. As a marker of the performance we used the slope of the psychometric function (3), which says how the performance changes with the difficulty of the task. The higher the slope s, the better the performance of an individual (or a group).

We analyzed a number of models, also modifying them by allowing interaction of only a few people at once. Some of the models, namely the Random Responder and the Voting, can always be considered as a strategy for group decision-making. For the Best Decides one needs to assume that the group possesses information indicating who performs better (e.g. from the feedback). Other models (i.e. the Weighted Confidence Sharing, the Direct Signal Sharing and the Truth Wins) make direct assumptions about the problem structure or the information that can be shared. Consequently, they can be adopted only in a subset of two-choice discriminative tasks. The list of the models is by no means exhaustive.

For each investigated model we arrived at a formula for the slope of a group as a function of the individual slopes:

$$ s_{\mathrm{model}}(s_1, \ldots, s_n) = \mathrm{multiplier}_{\mathrm{model}}(n) \times \mathrm{mean}_{\mathrm{model}}(s_1, \ldots, s_n), \qquad (53) $$

where the explicit results are given in Tab. 1 and Fig. 4. Note that the formula has two factors: the part describing how the group size affects the performance, and the mean of the individual slopes (whose type reflects whether the better-performing participants contribute more to the outcome). For equally skilled participants only the multiplier matters, whereas for a group of people with a high variance of performances the type of mean is crucial.

We not only solved the problem for a particular list of models, but also constructed a general framework for the collaborative solving of a two-choice task, i.e. the group performance can be written down as

$$ s_{\mathrm{model}}(s_1, \ldots, s_n) = d \times n^{\alpha} \times \left( \frac{s_1^p + \ldots + s_n^p}{n} \right)^{1/p}, \qquad (54) $$

where the parameters d, $\alpha$ and p can be fitted to any experimental data, even data not covered by the models we investigated. Note that for p = 1 we arrive at the arithmetic mean, for p = 2 at the quadratic mean, and for $p \to \infty$ at the maximum. For the models we investigated, (54) is either an exact solution (RR, WCS, BD, DSS, TW) or a good approximation (Voting, information-aggregation schemes). If the result is exact, then d = 1 (to be consistent with the case of n = 1).
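
The general form (54) is easy to evaluate directly. The following sketch is one possible implementation; the function name `group_slope`, the dictionary `MODELS` and the example slopes are hypothetical, and the parameter triples are read off from Tab. 1 (the Voting entry uses the approximate value d ≈ 0.8, so it is not exact for small n).

```python
import math

def group_slope(slopes, d=1.0, alpha=0.0, p=1.0):
    """General form (54): d * n**alpha * (power mean of order p)."""
    n = len(slopes)
    if math.isinf(p):
        mean = max(slopes)                       # limit p -> infinity
    else:
        mean = (sum(s ** p for s in slopes) / n) ** (1 / p)
    return d * n ** alpha * mean

# (d, alpha, p) read off from Tab. 1; the Voting entry is only approximate
MODELS = {
    "RR":  dict(d=1.0, alpha=0.0, p=1),
    "Vot": dict(d=0.8, alpha=0.5, p=1),
    "BD":  dict(d=1.0, alpha=0.0, p=math.inf),
    "WCS": dict(d=1.0, alpha=0.5, p=1),
    "DSS": dict(d=1.0, alpha=0.5, p=2),
    "TW":  dict(d=1.0, alpha=1.0, p=1),
}

slopes = [1.0, 2.0, 3.0]   # hypothetical individual slopes
for name, params in MODELS.items():
    print(name, round(group_slope(slopes, **params), 3))
```

Fitting d, $\alpha$ and p to measured group slopes would then amount to a three-parameter regression on top of this function.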

For a given list of slopes $(s_1, \ldots, s_n)$ it is possible to write relations between the performances (slopes) for different models, which read

$$ s_{\mathrm{RR}} \le s_{\mathrm{Vot}} \le s_{\mathrm{WCS}} \le s_{\mathrm{DSS}} \le s_{\mathrm{TW}}. \qquad (55) $$

An average-performing participant is expected to benefit from participating in joint task solving, unless the responder is chosen at random, in which case there is neither gain nor loss. It is somewhat more difficult to place the Best Decides model, as it depends strongly on the distribution of the participants' skills. We can write

$$ s_{\mathrm{RR}} \le s_{\mathrm{BD}} \le s_{\mathrm{DSS}} \le s_{\mathrm{TW}}, \qquad (56) $$

but how does the Best Decides model relate to the Voting and the Weighted Confidence Sharing? The answer lies in the comparison of the most skilled participant to the average performance, i.e. $\max(s)/\langle s \rangle$. If it is greater than $\approx 0.8\sqrt{n}$, the Best Decides model outperforms the Voting. If the ratio is greater than $\sqrt{n}$, it outperforms the WCS as well. For example, when there is one expert (with $s_{\mathrm{exp}} > 1$) among n participants in total, the others having $s_{\mathrm{non\text{-}exp}} = 1$, then it is better for the group to use the Best Decides strategy only when $s_{\mathrm{exp}} > \sqrt{n} + 1$.
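
The threshold in this example follows from a short calculation, spelled out here for the comparison with the WCS model and using the formulas of Tab. 1 (one expert with slope $s_{\mathrm{exp}}$ and $n - 1$ participants with slope 1):

```latex
\begin{align*}
s_{\mathrm{BD}} &= s_{\mathrm{exp}}, \qquad
s_{\mathrm{WCS}} = \frac{s_{\mathrm{exp}} + (n-1)}{\sqrt{n}}, \\
s_{\mathrm{BD}} > s_{\mathrm{WCS}}
&\iff s_{\mathrm{exp}} \left( \sqrt{n} - 1 \right) > n - 1
   = \left( \sqrt{n} - 1 \right)\left( \sqrt{n} + 1 \right) \\
&\iff s_{\mathrm{exp}} > \sqrt{n} + 1 \qquad (n > 1).
\end{align*}
```

For instance, in a group of n = 9 the expert's slope has to exceed 4 times that of the other participants for Best Decides to beat WCS.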



Model   s(s_1, s_2)           s(s_1, s_2, s_3)              Mean         Multiplier
RR      (s_1 + s_2)/2         (s_1 + s_2 + s_3)/3           arithmetic   1
Vot     (s_1 + s_2)/2         (s_1 + s_2 + s_3)/2           arithmetic   ≈ 0.8 √n
BD      max(s_1, s_2)         max(s_1, s_2, s_3)            maximum      1
WCS     (s_1 + s_2)/√2        (s_1 + s_2 + s_3)/√3          arithmetic   √n
DSS     √(s_1^2 + s_2^2)      √(s_1^2 + s_2^2 + s_3^2)      quadratic    √n
TW      s_1 + s_2             s_1 + s_2 + s_3               arithmetic   n

Table 1: Summary of the six considered models [3]. For each model the explicit formula for two and three members is given. In each model $s_{\mathrm{model}}$ has the general form multiplier × mean.



Figure 4: Plot summarizing multipliers for different models (vertical axis: multiplier).

For schemes of aggregation (Tab. 2) we obtained two interesting results. First, most of the models we investigated are not affected at all by gradual aggregation of information. Second, for the models that are affected, the optimal solution is also the one requiring the least effort: one needs to group information in the smallest possible groups, i.e. g = 2 for Weighted Confidence Sharing and g = 3 for Voting.

Model   g   Shallow hierarchy   Random hierarchy   Deep hierarchy
Vot     3   n^0.37              1.22 n^0.25        2.00
Vot     4   n^0.18              1.15 n^0.08        1.36
Vot     5   n^0.35              1.38 n^0.19        2.15
WCS     2   n^0.5               1.13 n^0.41        2.41
WCS     3   n^0.5               1.25 n^0.37        2.73
WCS     4   n^0.5               1.37 n^0.33        3.00
WCS     5   n^0.5               1.48 n^0.31        3.23

Table 2: Summary of information-aggregation results (see Section 4) in groups of g agents for the affected models, i.e. Voting and Weighted Confidence Sharing. For each model the asymptotic multipliers are provided for three different information-aggregation hierarchies. In each model $s_{\mathrm{model}}$ has the form multiplier times arithmetic mean. Note that for Voting, grouping in g = 4 is very ineffective (as, in fact, it effectively uses the opinions of three out of four participants). Also note that asymptotically the most effective approach (i.e. the best for very large groups) for the Shallow and Random aggregation schemes is to gather information in the smallest possible groups of agents (i.e. g = 3 for Voting and g = 2 for WCS).



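It is possible that the participants' strategy varies from trial to trial. In such situations the outcome would be a mixture of strategies (with weights $w_{\mathrm{model}}$), that is

$$ P_{\mathrm{eff}}(c) = \sum_{\mathrm{models}} w_{\mathrm{model}}\, P_{\mathrm{model}}(c), \qquad (57) $$

$$ s_{\mathrm{eff}} = \sum_{\mathrm{models}} w_{\mathrm{model}}\, s_{\mathrm{model}}. \qquad (58) $$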
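
A small numerical sketch of (57) and (58) follows. It assumes, for illustration only, that each strategy has an unbiased cumulative-Gaussian psychometric function, so that all components are steepest at c = 0. The weights and slopes are made-up numbers.

```python
import math

def psychometric(c, s):
    """Unbiased cumulative-Gaussian psychometric function with maximal slope s."""
    return 0.5 * (1 + math.erf(math.sqrt(math.pi) * s * c))

def effective_slope(weights, slopes):
    """Right-hand side of (58)."""
    return sum(w * s for w, s in zip(weights, slopes))

weights = [0.5, 0.3, 0.2]      # hypothetical strategy weights, summing to 1
slopes = [1.0, 1.4, 2.0]       # hypothetical slopes of the three strategies

def p_eff(c):
    """Mixture of psychometric functions, as in (57)."""
    return sum(w * psychometric(c, s) for w, s in zip(weights, slopes))

h = 1e-6
numeric_slope = (p_eff(h) - p_eff(-h)) / (2 * h)   # slope of the mixture at c = 0
print(numeric_slope, effective_slope(weights, slopes))
```

With biased components the maxima of the individual curves no longer coincide, so in that case the weighted sum of slopes should be read as an approximation.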

In order to distinguish between models, the analysis of performance alone might not be enough, as (psychologically) different models of problem-solving can yield the same performance. One can test modified schemes that put additional constraints on the participants' interaction in order to investigate the communication aspect directly. For example, contact with other members could be limited to voice or text chat communication, or there may be no feedback provided. In addition, the participants might be asked to express their confidence explicitly on a Likert scale. However, further experimental work should be carried out to clarify whether the confidence is subjectively accessible and communicated explicitly, or rather read from the participant's behavior. The amount of feedback could range from full information about the stimulus, through simple information about the correctness, to no feedback at all. As a reference one may use the Social Decision Scheme Theory of Davis (1973), where the group decision is considered as a function of individual choices, regardless of their skill, confidence or the difficulty of the task.

In all models interaction is beneficial for the performance, except for the Random Responder model (where the performance is the same as the average performance of the individuals). It may also be possible, see Grofman (1978), that beyond a certain critical size groups start performing worse instead of better. The models we consider do not predict such a collapse, as they are based on information sharing and do not incorporate phenomena related to motivation and the social or technical ability to work in a group.

One needs to be aware that the presented models are valid only for the specific situation of collaborative solving of a two-choice perceptive task, where the difficulty can be smoothly adjusted. Some other tasks may be analyzed within the same paradigm, like integrating information in one mind, i.e. several exposures to the same stimuli by one person, perhaps with different senses or with different noise levels; a similar experiment is described in Ernst and Banks (2002). Perhaps collaborative solving of other two-choice tasks (e.g. verbal or mathematical) can be treated in a similar way. However, for many other settings more advanced models are needed, e.g. ones taking into account more choices or the dynamical interaction between solving a problem in one's mind and communication with the other participants. Nevertheless, the authors believe that the first step should be to experimentally verify the predictions of this paper (with the emphasis on the scaling of the performance), before proceeding to more advanced theoretical models.

Appendix A. Approximations

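P(c) can be expanded in a Taylor series in c around c = -b:

$$ P(c) = P[-b + (c + b)] \qquad (A.1) $$

$$ = P(-b) + (c+b)\, P'(-b) + \frac{(c+b)^2}{2!}\, P''(-b) + \frac{(c+b)^3}{3!}\, P'''(-b) + \ldots \qquad (A.2) $$

where $P^{(i)}(-b)$ can be found explicitly using (1),

$$ P^{(i)}(c) = \frac{1}{\sigma^i}\, H^{(i)}\!\left( \frac{c+b}{\sigma} \right). \qquad (A.3) $$

In particular $H(0) = 1/2$, $H'(0) = 1/\sqrt{2\pi}$, $H''(0) = 0$, $H'''(0) = -1/\sqrt{2\pi}$.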
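In general, making use of Hermite polynomials,

$$ H^{(i)}(0) = \frac{(-1)^{(i-1)/2}\, (i-2)!!}{\sqrt{2\pi}} \qquad (A.4) $$

for odd i > 0 (with the convention $(-1)!! = 1$) and $H^{(i)}(0) = 0$ for even i > 0.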
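Consequently,

$$ P(c) = \frac{1}{2} + \frac{1}{\sqrt{2\pi}}\, \frac{c+b}{\sigma} + O\!\left[ \left( \frac{c+b}{\sigma} \right)^3 \right], \qquad (A.5) $$

that is, the error of the linear approximation is of the order $(c+b)^3/\sigma^3$, as the quadratic term vanishes. Plugging in c = 0 we obtain

$$ P(0) = \frac{1}{2} + \frac{b}{\sqrt{2\pi}\, \sigma} + O\!\left[ \left( \frac{b}{\sigma} \right)^3 \right] \qquad (A.6) $$

$$ = \frac{1}{2} + sb + O[(sb)^3] \qquad (A.7) $$

and similarly, the derivative of (A.5) at c = 0 is

$$ P'(c)\big|_{c=0} = \frac{1}{\sqrt{2\pi}\, \sigma} + \frac{1}{\sqrt{2\pi}\, \sigma}\, O\!\left[ \left( \frac{b}{\sigma} \right)^2 \right] \qquad (A.8) $$

$$ = s \left[ 1 + O(s^2 b^2) \right]. \qquad (A.9) $$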

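The last equation gives the approximate equation for the slope (7). Another expression,

$$ \frac{P(0) - \frac{1}{2}}{P'(c)\big|_{c=0}} = b\, \frac{1 + O[(sb)^2]}{1 + O[(sb)^2]} = b \left[ 1 + O\!\left( (sb)^2 \right) \right], \qquad (A.10) $$

yields the approximate equation for the bias (7).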
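
The cubic error of the linearization at c = 0 can be illustrated numerically. The sketch below assumes that H is the standard normal cumulative distribution function, which is consistent with the values H(0) = 1/2 and H'(0) = 1/√(2π) quoted above, and that $s = 1/(\sqrt{2\pi}\,\sigma)$. The function names and the chosen values of b are illustrative.

```python
import math

def P(c, b, sigma):
    """Psychometric function, assuming H is the standard normal CDF."""
    return 0.5 * (1 + math.erf((c + b) / (sigma * math.sqrt(2))))

def linear_error(b, sigma):
    """Difference between the linearization 1/2 + s*b and the exact P(0)."""
    s = 1 / (math.sqrt(2 * math.pi) * sigma)
    return (0.5 + s * b) - P(0.0, b, sigma)

sigma = 1.0
for b in (0.4, 0.2, 0.1):
    print(b, linear_error(b, sigma))
# Halving b should reduce the error roughly eightfold (cubic scaling).
```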
Appendix B. Voting 



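$$ P_{\mathrm{Vot}}(c) = \sum_{k=0}^{\lceil n/2 \rceil - 1} \sum_{i} [1 - P_{i_1}(c)] \cdots [1 - P_{i_k}(c)]\, P_{i_{k+1}}(c) \cdots P_{i_n}(c) \qquad (B.1) $$

$$ {} + \frac{1}{2} \sum_{i} [1 - P_{i_1}(c)] \cdots [1 - P_{i_{n/2}}(c)]\, P_{i_{n/2+1}}(c) \cdots P_{i_n}(c) \quad \text{if } n \text{ is even}, $$

where the inner sums run over all distinct choices of the participants $i_1, \ldots, i_k$ who answer incorrectly.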



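After plugging the linearization (A.5) into the above and using $\mu_j = s_j (b_j + c)$, each part has the form

$$ [1 - P_{i_1}(c)] \cdots [1 - P_{i_k}(c)]\, P_{i_{k+1}}(c) \cdots P_{i_n}(c) \qquad (B.2) $$

$$ = \left[ \tfrac{1}{2} - \mu_{i_1} + O(\mu_{i_1}^3) \right] \cdots \left[ \tfrac{1}{2} + \mu_{i_n} + O(\mu_{i_n}^3) \right] \qquad (B.3) $$

$$ = \frac{1}{2^n} - \frac{1}{2^{n-1}} \left( \mu_{i_1} + \ldots + \mu_{i_k} \right) + \frac{1}{2^{n-1}} \left( \mu_{i_{k+1}} + \ldots + \mu_{i_n} \right) \qquad (B.4) $$

$$ {} + O(\mu_{i_1}^2) + \ldots + O(\mu_{i_n}^2). \qquad (B.5) $$

After applying permutations to the main part (i.e. without the error estimation) we get

$$ \frac{1}{2^n} \binom{n}{k} + \frac{1}{2^{n-1}} \binom{n}{k} \left[ -k + (n-k) \right] \frac{\mu_1 + \ldots + \mu_n}{n} \qquad (B.6) $$

$$ = \frac{1}{2^n} \binom{n}{k} + \frac{n}{2^{n-1}} \left[ -\binom{n-1}{k-1} + \binom{n-1}{k} \right] \frac{\mu_1 + \ldots + \mu_n}{n}, \qquad (B.7) $$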



which is easy to sum over k. The first component sums to 1/2. In the second, the binomial coefficients cancel pairwise, except for $\binom{n-1}{-1} = 0$ and $\binom{n-1}{\lfloor (n-1)/2 \rfloor}$, leaving only $\binom{n-1}{k}$ for $k = \lfloor (n-1)/2 \rfloor$. Consequently, when n is odd, one gets

$$ P_{\mathrm{Vot,odd}}(c) = \frac{1}{2} + \frac{n}{2^{n-1}} \binom{n-1}{(n-1)/2} \frac{\mu_1 + \ldots + \mu_n}{n} + O(\mu_1^2) + \ldots + O(\mu_n^2) \qquad (B.8) $$

and for even n

$$ P_{\mathrm{Vot,even}}(c) = \frac{1}{2} + \frac{n}{2^{n-1}} \binom{n-1}{(n-2)/2} \frac{\mu_1 + \ldots + \mu_n}{n} + O(\mu_1^2) + \ldots + O(\mu_n^2). \qquad (B.9) $$



After the differentiation one obtains the slope ([T]) and the bias (J8j). 
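
The coefficient in (B.8) can be checked against a direct computation of the majority-vote probability. The sketch below makes the simplifying assumption that all voters have the same probability p = 1/2 + μ of answering correctly, so that the derivative with respect to μ at μ = 0 should equal the multiplier. The comparison with $\sqrt{2n/\pi}$ for large n is a standard Stirling estimate added here for illustration, not a formula taken from the text.

```python
import math

def voting_multiplier(n):
    """Coefficient of the mean of mu in (B.8), valid for odd n."""
    return n / 2 ** (n - 1) * math.comb(n - 1, (n - 1) // 2)

def majority_probability(p, n):
    """Probability that more than half of n independent voters are right."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range((n + 1) // 2, n + 1))

h = 1e-6
for n in (3, 5, 11):
    numeric = (majority_probability(0.5 + h, n)
               - majority_probability(0.5 - h, n)) / (2 * h)
    print(n, voting_multiplier(n), numeric)

# Large-n comparison with sqrt(2 n / pi), i.e. roughly 0.8 * sqrt(n)
print(voting_multiplier(101), math.sqrt(2 * 101 / math.pi))
```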
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